Lecture 5
UW Course Evaluations via IASystem: you should have received a notification about this course's midterm course feedback. It would mean a lot to me if you filled it out.

Suppose your teammate gave you a function to find the maximal clique in an adjacency matrix (i.e., the set of nodes that forms the largest clique). You are not told the typical size and characteristics of these adjacency matrices beforehand. Your job is to make sure this function is correct, since you and your teammate are about to give this function to your manager, who will then give it to another division in your company to use. Your performance review will depend highly on whether or not other people in your company can reliably use your function.
In a short paragraph, write down ways to ensure your teammate’s function is “correct.” Please list at least four different ways you can test this function. You can interpret this notion of “correct” very liberally – this question is purposely framed to be open-ended.
- Checking that the function outputs something of the correct type
- Simple checks to make sure the result is within the correct range
- Making sure the function runs on many different inputs
- Making sure the function gets the correct answer for carefully crafted problems: "Write a `generate_random_graph` function, which generates an adjacency matrix for which we know the exact size of the clique."
- Making sure the function gets the correct answer with the help of visualization: "Use `generate_random_graph()` and visualize the maximal clique by reversing the order of the nodes with the function `rev_order()`."
- Making sure the function gets the correct answer with many randomly generated problems: "Use `generate_random_graph()` … Repeat the above process for 50 different matrices."
- Stress testing to make sure the function handles corner cases gracefully: "Try when `clique_fraction` is 0 or 1, or when `n` is 0 or 1, and see if the function handles the edge cases properly."
- Testing the behavior when there is deliberately no unique answer
- Comparing against another known implementation
- Comparing against another one of your implementations (possibly much more computationally intensive, but more transparent and definitely correct)
- Math: exploiting some mathematical property of your problem: "The size of the maximal clique should not change after permuting `adj_mat` by rows and columns. We can test this property to make sure it works properly."
- Testing the timing of your function: "For an `adj_matrix` with a very large `n`, we should test the performance of our function to see whether it gives us a satisfactory answer within an acceptable time window. This is not really about testing whether our function outputs the right answer, but since we already know this problem is NP-hard, we should take efficiency as part of our definition of a 'correct function'."
- Testing to make sure it errors when expected
- Testing the default values
- Testing the coding environment
- Checking the documentation / literate coding: "Use `help()` in R to make sure the description and the code follow the same logic."
- Checking the intermediary functions!
You ideally should be writing unit tests for each function you write. The more you “refactor” your functions (into more manageable smaller functions), the easier you’ll know what tests to write.
RStudio and R packages are convenient for easily running all your unit tests, so you can easily assess how “stable” your codebase is.
This can be painful!! Whenever you update your method, you might need to update your unit tests. However, this should be a “rolling experience” – as you find new bugs in your code, you should be writing more unit tests.
In my experience, most PhD students I talk to cannot be bothered to write unit tests since it’s a pain to write/maintain all this “additional code.”
At the same time, in my experience, it’s not a matter of “whether or not my codebase crashes,” but rather a question of “what time in the future will my codebase become so complicated that everything starts to fail at the same time?”
After you’ve had your first traumatic experience of your codebase failing, come back to these slides and learn more about how to write unit tests so you don’t need to relive the trauma in the future.
Inside your package's `tests` folder, there are two things: a folder called `testthat` and a `.R` file called `testthat.R`. (You can ignore the `.DS_Store` file.) If you open the `testthat.R` file, you'll see that it looks like this:

```r
library(testthat)
library(UW561S2024Example)
test_check("UW561S2024Example")
```

Every `testthat.R` file you ever write will only have specifically these three lines. Replace `UW561S2024Example` in both places with your package name (for your homeworks, this would be `UWBiost561`). See this in the `testthat` folder: https://github.com/linnykos/561_s2024_example/tree/main/tests/testthat

You write one `.R` file in the `testthat` folder for every `.R` file in the `R` folder (where your functions live). Notice the pairing between each function (in the `R` folder) and its corresponding tests (in the `testthat` folder). Let's look into the unit tests in `test_compute_probabilities.R`: https://github.com/linnykos/561_s2024_example/blob/main/tests/testthat/test_compute_probabilities.R
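Put together, the standard layout looks like this (a sketch; the file names inside `R/` and `testthat/` will vary by package):

```
UW561S2024Example/
├── DESCRIPTION
├── R/
│   └── compute_probabilities.R
└── tests/
    ├── testthat.R
    └── testthat/
        └── test_compute_probabilities.R
```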
The first line of every `.R` file in the `testthat` folder is the context. What do you put in it? The name of the corresponding `.R` file in the `R` folder:

```r
context("Testing compute_probabilities")

# Unit test for compute_probabilities
test_that("compute_probabilities outputs correctly", {
  set.seed(10)

  # Mock data and parameters for testing
  data <- matrix(rnorm(20), nrow = 10, ncol = 2) # 10 samples, 2-dimensional
  means <- matrix(c(0, 0, 5, 5), nrow = 2, byrow = TRUE) # 2 components
  variances <- c(1, 2)
  proportions <- c(0.5, 0.5)

  probabilities <- compute_probabilities(data, means, variances, proportions)

  # Test if probabilities sum to 1 for each sample
  expect_true(all(abs(rowSums(probabilities) - 1) < 1e-6))

  # Test if probabilities are within the valid range [0,1]
  expect_true(all(probabilities >= 0 & probabilities <= 1))

  # Test for handling of a single sample (edge case)
  single_sample <- data[1, , drop = FALSE] # Prevent dropping to lower dimension
  probabilities_single <- compute_probabilities(single_sample,
                                                means,
                                                variances,
                                                proportions)
  expect_true(dim(probabilities_single)[1] == 1)
  expect_true(all(abs(rowSums(probabilities_single) - 1) < 1e-6))
})
```
Each unit test is wrapped inside the `test_that()` function. All of the code for the test goes inside the `{}` block.
To test `compute_probabilities`, we need to actually use the `compute_probabilities` function. This gives us `probabilities`, and the rest of the test checks that `probabilities` is what we expect.
We use `expect_true()` to let R know that we expect a particular statement to be true. The argument of `expect_true()` is itself code that will be evaluated; if it evaluates to a `TRUE` boolean, then the test will pass (but I have not yet shown you how these tests are run!).
To run all your tests, run `devtools::check()` in the R console. (Technically, if you only want to run your tests and nothing else, you can use `devtools::test()`.) You can also do this from the Build pane in RStudio. Make sure the `.Rproj` file is in the top folder of your R package; that is, the `.Rproj` file should sit alongside `DESCRIPTION`, the `R` folder, and the `tests` folder. Otherwise, when you run `devtools::check()`, you'll get an error. (Essentially, R doesn't know what R package you're trying to test.)

Demo with the `UW561S2024Example` package: `devtools::test()` and `devtools::check()`, what the results look like when a test fails, and what `devtools::check()` looks at.

It's often useful to write checking code inside your function itself (i.e., not in the `testthat` folder, but in the `R` folder). This is often done by using either the `stopifnot()` function or the `stop()` function:
```r
cleanup_na_matrix <- function(mat){
  # is.numeric() already returns a single logical, so no all() is needed
  stopifnot(is.matrix(mat), is.numeric(mat))

  n <- nrow(mat)
  p <- ncol(mat)
  mat <- sapply(1:p, function(j){
    .cleanup_vector(mat[,j])
  })
  return(mat)
}
```

Alternative:

```r
cleanup_na_matrix <- function(mat){
  # use the scalar || here, since each condition is a single logical
  if(!is.matrix(mat) || !is.numeric(mat))
    stop("mat is not a numeric matrix")

  n <- nrow(mat)
  p <- ncol(mat)
  mat <- sapply(1:p, function(j){
    .cleanup_vector(mat[,j])
  })
  return(mat)
}
```
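These slides don't show `.cleanup_vector()`. As a purely hypothetical sketch of what such a helper could do (the real one may differ), here is a version that replaces `NA`s with the mean of the observed entries:

```r
# Hypothetical helper -- the real .cleanup_vector() may do something different.
# Replaces NAs in a numeric vector with the mean of the non-missing values.
.cleanup_vector <- function(vec){
  stopifnot(is.numeric(vec))
  vec[is.na(vec)] <- mean(vec, na.rm = TRUE)
  vec
}

.cleanup_vector(c(1, NA, 3))  # the NA becomes 2, the mean of 1 and 3
```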
The intervals should grow as `alpha` gets smaller: the intervals for `alpha=0.5` (i.e., a 50% confidence interval) should be much smaller than the intervals for `alpha=0.01` (i.e., a 99% confidence interval) for the vast majority of instances.

Testing for completeness:
Testing for correctness: compare against a `for` loop that (mathematically) does the same thing but is very slow.

Elements #8 through #11 could all benefit from randomly generated inputs since you don't really need to know what the exact output is to write a useful test!
More examples and reading:
- tiltedCCA (mine): https://github.com/linnykos/tiltedCCA/tree/master/tests/testthat
- radEmu (by Amy): https://github.com/adw96/radEmu/tree/main/tests/testthat
- ggplot2: https://github.com/tidyverse/ggplot2/tree/main/tests/testthat
- The testthat paper: https://vita.had.co.nz/papers/testthat.pdf